3574 results found.
Written
Corpus,
Language Type:
Multilingual
Languages:
English Spanish
Availability:
Freely Available
License:
Open Source
Size:
52000000 sentences
Production Status:
Newly created-finished
Use:
Machine Learning
- Paper title: Twitter as a Lifeline: Human-annotated Twitter Corpora for NLP of Crisis-related Messages
- Paper track: Written
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Muhammad Imran | Qatar Computing Research Institute | QA |
| Author 2 | Prasenjit Mitra | Qatar Computing Research Institute | QA |
| Author 3 | Carlos Castillo | Sapienza University of Rome | IT |
| Main Contact | Muhammad Imran | Qatar Computing Research Institute | None |
Documentation:
<Not Specified>
Language Type:
Multilingual
Languages:
English
Availability:
From Owner
License:
Creative Commons Attribution-NonCommercial 4.0 International
Size:
63565 sentences
Production Status:
Newly created-in progress
Use:
Information Extraction, Information Retrieval
- Paper title: The OpenCourseWare Metadiscourse (OCWMD) Corpus
- Paper track: Evaluation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Ghada Alharbi | University of Sheffield | GB |
| Author 2 | Thomas Hain | University of Sheffield | GB |
| Main Contact | Ghada Alharbi | University of Sheffield | None |
Documentation:
Yes, there is a README file with the corpus.
Written
Word Sense Disambiguator,
Language Type:
Multilingual
Languages:
English
Availability:
Freely Available
License:
<Not Specified>
Size:
<Not Specified>
Production Status:
Existing-used
Use:
Word Sense Disambiguation
- Paper title: Improving the Extraction of Clinical Concepts from Clinical Records
- Paper track: <Not Specified>
- Paper status: Accept-Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Xiao Fu | University of Manchester | GB |
| Author 2 | Sophia Ananiadou | University of Manchester | GB |
| Main Contact | Xiao Fu | University of Manchester | None |
Documentation:
<Not Specified>
Modality Independent
Lexicon,
Language Type:
Multilingual
Languages:
English
Availability:
Freely Available
License:
MIT
Size:
121 MByte
Production Status:
Newly created-finished
Use:
Knowledge Discovery/Representation
- Paper title: Building Concept Graphs from Monolingual Dictionary Entries
- Paper track: Written
- Paper status: Accept Poster+Demo
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Gábor Recski | Research Institute for Linguistics, Hungarian Academy of Sciences | HU |
| Main Contact | Gábor Recski | Research Institute for Linguistics, Hungarian Academy of Sciences | None |
Documentation:
https://github.com/kornai/4lang/blob/master/README.md
Written
Corpus,
Language Type:
Trilingual
Languages:
English German French
Availability:
Freely Available
License:
<Not Specified>
Size:
225 novels Other
Production Status:
Existing-used
Use:
Evaluation/Validation
- Paper title: Delta vs. N-Gram Tracing: Evaluating the Robustness of Authorship Attribution Methods
- Paper track: Evaluation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Thomas Proisl | FAU Erlangen-Nürnberg | DE |
| Author 2 | Stefan Evert | FAU Erlangen-Nürnberg | DE |
| Author 3 | Fotis Jannidis | Julius-Maximilians-Universität Würzburg | DE |
| Author 4 | Christof Schöch | University of Wuerzburg | DE |
| Author 5 | Leonard Konle | Julius-Maximilians-Universität Würzburg | DE |
| Author 6 | Steffen Pielström | Julius-Maximilians-Universität Würzburg | DE |
| Main Contact | Stefan Evert | FAU Erlangen-Nürnberg | None |
Documentation:
<Not Specified>
Not Applicable
machine learning classifier,
Language Type:
Multilingual
Languages:
English
Availability:
Freely Available
License:
http://www.csie.ntu.edu.tw/~cjlin/liblinear/
Size:
828 <Not Specified>
Production Status:
Existing-used
Use:
Document Classification, Text categorisation
- Paper title: A corpus of general and specific sentences from news
- Paper track: Written
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Annie Louis | University of Pennsylvania | None |
| Author 2 | Ani Nenkova | University of Pennsylvania | None |
| Main Contact | Annie Louis | University of Pennsylvania | US |
Documentation:
Yes, English, http://www.csie.ntu.edu.tw/~cjlin/liblinear/
Language Type:
Multilingual
Languages:
English
Availability:
From Owner
License:
<Not Specified>
Size:
1.6 <Not Specified>
Production Status:
Existing-updated
Use:
Emotion Recognition/Generation
- Paper title: Extending the EmotiNet Knowledge Base to Improve the Automatic Detection of Implicitly Expressed Emotions from Text
- Paper track: Evaluation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Alexandra Balahur | <Not Specified> | None |
| Author 2 | Jesús M. Hermida | <Not Specified> | None |
| Main Contact | Alexandra Balahur | European Commission Joint Research Centre | IT |
Documentation:
http://doi.ieeecomputersociety.org/10.1109/T-AFFC.2011.33
Written
Coreference Resolution,
Language Type:
Multilingual
Languages:
English
Availability:
Freely Available
License:
GNU
Size:
<Not Specified>
Production Status:
Existing-updated
Use:
Coreference Resolution
- Paper title: Corpus for Coreference Resolution on Scientific Papers
- Paper track: Evaluation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Panot Chaimongkol | The University of Tokyo | JP |
| Author 2 | Akiko Aizawa | National Institute of Informatics; The University of Tokyo | JP |
| Author 3 | Yuka Tateisi | National Institute of Informatics | JP |
| Main Contact | Panot Chaimongkol | The University of Tokyo | None |
Documentation:
<Not Specified>
Written
Corpus,
Language Type:
Multilingual
Languages:
English
Availability:
From Data Center(s)
License:
<Not Specified>
Size:
4580000000 tokens
Production Status:
Existing-used
Use:
Text Mining
- Paper title: Unsupervised Multiword Segmentation of Large Corpora using Prediction-Driven Decomposition of n-grams
- Paper track: Software, Tools
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Julian Brooke | University of Melbourne; University of Toronto | AU |
| Author 2 | Vivian Tsang | Quillsoft Ltd. | CA |
| Author 3 | Graeme Hirst | University of Toronto | CA |
| Author 4 | Fraser Shein | Quillsoft Ltd. | None |
| Main Contact | Julian Brooke | University of Melbourne | None |
Documentation:
<Not Specified>
Written
Corpus,
Language Type:
Multilingual
Languages:
English Standard Arabic
Availability:
Freely Available
License:
OpenSource
Size:
100.1 MByte
Production Status:
Existing-updated
Use:
Corpus Creation/Annotation
- Paper title: OSMAN – A Novel Arabic Readability Metric
- Paper track: Evaluation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Mahmoud El-Haj | Lancaster University | GB |
| Author 2 | Paul Rayson | Lancaster University | GB |
| Main Contact | Mahmoud El-Haj | Lancaster University | None |
Documentation:
<Not Specified>




